Vejle
Pulmonologists-Level lung cancer detection based on standard blood test results and smoking status using an explainable machine learning approach
Flyckt, Ricco Noel Hansen, Sjodsholm, Louise, Henriksen, Margrethe Høstgaard Bang, Brasen, Claus Lohman, Ebrahimi, Ali, Hilberg, Ole, Hansen, Torben Frøstrup, Wiil, Uffe Kock, Jensen, Lars Henrik, Peimankar, Abdolrahman
Lung cancer (LC) remains the primary cause of cancer-related mortality, largely due to late-stage diagnoses. Effective strategies for early detection are therefore of paramount importance. In recent years, machine learning (ML) has demonstrated considerable potential in healthcare by facilitating the detection of various diseases. In this retrospective development and validation study, we developed an ML model based on dynamic ensemble selection (DES) for LC detection. The model leverages standard blood sample analysis and smoking history data from a large population at risk in Denmark. The study includes all patients examined on suspicion of LC in the Region of Southern Denmark from 2009 to 2018. We validated and compared the predictions by the DES model with diagnoses provided by five pulmonologists. Among the 38,944 patients, 9,940 had complete data, of which 2,505 (25%) had LC. The DES model achieved an area under the ROC curve of 0.77±0.01, sensitivity of 76.2%±2.4%, specificity of 63.8%±2.3%, positive predictive value of 41.6%±1.2%, and F1-score of 53.8%±1.1%. The DES model outperformed all five pulmonologists, achieving a sensitivity 9% higher than their average. The model identified smoking status, age, total calcium levels, neutrophil count, and lactate dehydrogenase as the most important factors for the detection of LC. The results highlight the successful application of the ML approach in detecting LC, surpassing pulmonologists' performance. Incorporating clinical and laboratory data in future risk assessment models can improve decision-making and facilitate timely referrals.
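As a reading aid, the reported F1-score can be related to the reported sensitivity and positive predictive value, since F1 is the harmonic mean of the two. The sketch below is not taken from the paper; the function name is hypothetical, and it simply plugs the abstract's mean values into the standard formula, ignoring the reported standard deviations.

```python
def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of positive predictive value (precision) and sensitivity (recall).

    Hypothetical helper for illustration; it is not part of the study's code.
    """
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Mean values quoted in the abstract, expressed as fractions.
reported_ppv = 0.416
reported_sensitivity = 0.762

implied_f1 = f1_from_ppv_and_sensitivity(reported_ppv, reported_sensitivity)
print(f"Implied F1-score: {implied_f1:.3f}")
```

Under these assumptions the implied value rounds to 0.538, which agrees with the 53.8% F1-score stated in the abstract.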
- Europe > Denmark > Southern Denmark > Vejle (0.05)
- North America > United States > Maine (0.04)
- Europe > United Kingdom (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (0.86)
Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations
Peng, Siyao, Sun, Zihang, Loftus, Sebastian, Plank, Barbara
Named Entity Recognition (NER) is a key information extraction task with a long-standing tradition. While recent studies address and aim to correct annotation errors via re-labeling efforts, little is known about the sources of human label variation, such as text ambiguity, annotation error, or guideline divergence. This is especially the case for high-quality datasets and beyond English CoNLL03. This paper studies disagreements in expert-annotated named entity datasets for three languages: English, Danish, and Bavarian. We show that text ambiguity and artificial guideline changes are dominant factors for diverse annotations among high-quality revisions. We survey student annotations on a subset of difficult entities and substantiate the feasibility and necessity of manifold annotations for understanding named entity ambiguities from a distributional perspective.
- North America > United States > New York (0.05)
- North America > Canada > Ontario > Toronto (0.05)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Mapping Climate Change Research via Open Repositories & AI: advantages and limitations for an evidence-based R&D policy-making
Bovenzi, Nicandro, Duran-Silva, Nicolau, Massucci, Francesco Alessandro, Multari, Francesco, Parra-Rojas, César, Pujol-Llatse, Josep
In the last few years, several initiatives have started to offer open access to research output data and metadata. The platforms developed by those initiatives are opening up scientific production to the wider public, and they can be an invaluable asset for evidence-based policy-making in Science, Technology and Innovation (STI). These resources can indeed facilitate knowledge discovery and help identify available R&D assets and relevant actors within specific research niches of interest. Ideally, to gain a comprehensive view of entire STI ecosystems, the information provided by each of these resources should be combined and analysed accordingly. For this to be possible, at least a certain degree of interoperability should be guaranteed across data sources, so that data can be better aggregated and complemented and the evidence provided for policy-making is more complete and reliable. Here, we study whether this holds when mapping Climate Action research across the whole Danish STI ecosystem, using four popular open access STI data sources: OpenAIRE, OpenAlex, CORDIS and Kohesio.
- Europe > Denmark > Capital Region > Copenhagen (0.15)
- North America > United States (0.14)
- Europe > Denmark > North Jutland > Aalborg (0.05)
- Health & Medicine (1.00)
- Government (1.00)
- Energy > Renewable (0.93)